Top big data tools used to store and analyze data
Introduction
Big data is a term for collections of data sets so large and complex that they are difficult to process using traditional applications and tools. Because of the variety of information it encompasses, big data brings challenges relating to both its volume and its complexity.
A recent survey claims that 80% of the data generated in the world is unstructured. One challenge is how this unstructured information can be structured so that the most important data can be captured and understood. Another is how to store it. Listed below are the top tools used to store and analyse big data.
Apache Hadoop
Apache Hadoop is a free, Java-based software framework that can effectively store a great deal of information in a cluster. The framework runs in parallel on the cluster and can process data across all of its nodes. The Hadoop Distributed File System (HDFS) is Hadoop's storage system: it splits large files into blocks and distributes them across several nodes in the cluster. It also replicates the data within the cluster, which provides high availability.
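The split-and-replicate idea described above can be illustrated with a small in-memory sketch. This is not the HDFS API: the function names, the node names, the tiny block size and the round-robin placement are all assumptions made for illustration. Real HDFS uses much larger blocks and a rack-aware placement policy.

```python
def split_into_blocks(data: bytes, block_size: int) -> list[bytes]:
    """Cut the data into fixed-size blocks; the last block may be shorter."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_replicas(num_blocks: int, nodes: list[str], replication: int) -> dict[int, list[str]]:
    """Assign each block to `replication` distinct nodes (simple round-robin, hypothetical policy)."""
    if replication > len(nodes):
        raise ValueError("replication factor exceeds number of nodes")
    return {
        block_id: [nodes[(block_id + r) % len(nodes)] for r in range(replication)]
        for block_id in range(num_blocks)
    }


def readable_after_failure(placement: dict[int, list[str]], failed_node: str) -> bool:
    """True if every block still has at least one replica on a surviving node."""
    return all(any(n != failed_node for n in holders) for holders in placement.values())


# Hypothetical example: 1000 bytes, 256-byte blocks, four nodes, three replicas per block.
data = b"x" * 1000
nodes = ["node1", "node2", "node3", "node4"]
blocks = split_into_blocks(data, block_size=256)
placement = place_replicas(len(blocks), nodes, replication=3)
still_readable = readable_after_failure(placement, failed_node="node2")
```

Because every block lives on more than one node, losing a single node leaves all blocks readable, which is the sense in which replication gives high availability.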
Microsoft HDInsight
HDInsight is Microsoft's Hadoop-based service on the Azure cloud. It uses Windows Azure Blob storage as its default file system, which also provides high availability at low cost.
NoSQL
While traditional SQL databases can handle large quantities of structured data effectively, NoSQL (Not Only SQL) databases are needed to deal with unstructured data. They store unstructured information without a fixed schema.
NoSQL databases can give better performance when storing massive amounts of data. Many open-source NoSQL databases are available for analysing big data.
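To make "no fixed schema" concrete, here is a minimal sketch of a toy in-memory document store. The `DocumentStore` class, its methods and the sample records are hypothetical and do not correspond to any particular NoSQL product.

```python
import json


class DocumentStore:
    """Toy key-to-document store: each document may have different fields."""

    def __init__(self):
        self._docs = {}

    def put(self, key, doc):
        # Round-trip through JSON to copy the document and check it is serialisable.
        self._docs[key] = json.loads(json.dumps(doc))

    def get(self, key):
        return self._docs[key]

    def find(self, field, value):
        """Return keys of documents whose `field` equals `value`; documents lacking the field are skipped."""
        return [k for k, d in self._docs.items() if d.get(field) == value]


store = DocumentStore()
# Three documents with three different shapes; no table definition is needed first.
store.put("u1", {"name": "Asha", "email": "asha@example.com"})
store.put("u2", {"name": "Ben", "tags": ["admin"], "last_login": "2014-01-05"})
store.put("p1", {"title": "Survey notes", "body": "free text of any length"})

matches = store.find("name", "Ben")
```

In a relational table, the second and third records would require new columns or a separate table; here they are simply stored as they are.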
Hive
Hive supports a SQL-like query language, HiveQL (HQL), for querying big data stored in Hadoop. It is used primarily for data mining.
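As a rough illustration of what a SQL-like query looks like, the sketch below runs a simple aggregate using Python's built-in `sqlite3` module as a stand-in for a Hive connection. The `page_views` table and its rows are invented for the example. The `SELECT ... GROUP BY ... ORDER BY` statement uses only constructs that HiveQL also supports, but on Hive it would be compiled into distributed jobs over data in Hadoop rather than run against a local database.

```python
import sqlite3

# In-memory SQLite database standing in for a Hive table (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE page_views (page TEXT, visitor TEXT)")
conn.executemany(
    "INSERT INTO page_views VALUES (?, ?)",
    [("/home", "a"), ("/home", "b"), ("/docs", "a"), ("/home", "c")],
)

# Count views per page, most viewed first.
query = """
    SELECT page, COUNT(*) AS views
    FROM page_views
    GROUP BY page
    ORDER BY views DESC
"""
rows = conn.execute(query).fetchall()
conn.close()
```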
Sqoop
Sqoop is a tool that connects Hadoop with various relational databases in order to transfer data. It can be used effectively to move structured data into Hadoop or Hive.
PolyBase
PolyBase works on top of SQL Server 2012 Parallel Data Warehouse (PDW) and is used to access data stored in PDW. PDW is a data-warehousing appliance built for processing any quantity of relational data, and it integrates with Hadoop so that non-relational data can be handled as well.
Big data in Excel
Many people are comfortable doing data analysis in Excel, so users can connect to data stored in Hadoop using Excel 2013. The Power View feature of Excel 2013 can be used to summarise the data easily. Similarly, Microsoft's HDInsight allows users to connect to big data stored in the Azure cloud using Power Query.
Presto
Facebook has developed and open-sourced Presto, a SQL-on-Hadoop query engine built to handle petabytes of data. Unlike Hive, Presto does not depend on MapReduce and can retrieve data quickly.
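For readers unfamiliar with the MapReduce technique mentioned above, the following is a minimal single-process sketch of its three stages (map, shuffle, reduce) applied to a word count. The function names and sample lines are assumptions for illustration. A real Hadoop job distributes these stages across the nodes of a cluster.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in the line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)


lines = ["big data tools", "big data storage", "data analysis"]
mapped = [pair for line in lines for pair in map_phase(line)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
```

Each stage has to finish before the next one starts, which is one reason batch-style MapReduce jobs tend to have higher latency than an engine designed for interactive queries.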